A high-performance reverse proxy that sits in front of FastCGI servers like PHP-FPM. Built with fasthttp for minimal overhead and optimized for low memory allocations per request.
- FastCGI protocol implemented from scratch (no `net/http` dependency in the hot path)
- Robust CGI parameter construction (SERVER_NAME, REMOTE_ADDR, PATH_INFO, etc.)
- Location cache — proxy specific paths to external HTTP servers with in-memory caching
- Static responses — inline `return` bodies for paths like `/robots.txt` (nginx `location { return 200 '...'; }` equivalent), served with zero upstream calls and zero hot-path allocations
- Authentication — HTTP Digest (RFC 7616) or Basic (RFC 7617), with an HMAC-keyed bcrypt cache for Basic. Per-location bypass. See Authentication.
- Configurable response headers — inject security headers (HSTS, X-Frame-Options, etc.) into every response
- Configurable CORS — RFC-compliant origin allowlist, preflight handling, credentials, zero-allocation hot path. See CORS.
- Memory-optimized — `sync.Pool` for hot-path buffers, zero-allocation header processing, pre-sized maps
- Path traversal prevention with `filepath.Rel` boundary checks
- httpoxy (CVE-2016-5385) protection
- SSRF guard on location cache upstreams (blocks private/loopback/metadata IPs)
- Hop-by-hop header filtering (case-insensitive)
- CGI environment key validation (single-pass byte-level filter)
- Null byte rejection in URI path and query string
- Authoritative `X-Forwarded-For`/`X-Real-IP` injection (client-supplied values stripped)
- Configurable timeouts, body size limits, and concurrency caps
- Health endpoints: `/healthz` (liveness), `/readyz` (readiness — probes PHP-FPM's status page), `/healthz/fail` (graceful drain trigger)
- Graceful shutdown on SIGINT/SIGTERM, plus Kubernetes-friendly drain via `/healthz/fail` (flips readiness + `Connection: close`)
```sh
# Build
go build -o fcgi-proxy .

# Run with defaults (listens on :8080, upstream at 127.0.0.1:9000)
./fcgi-proxy

# Run with a config file
cp config.example.json config.json
./fcgi-proxy -config config.json

# Run with CLI overrides
./fcgi-proxy -listen :9090 -address 127.0.0.1:9001 -document-root /srv/www
```

The image runs as `nobody` (non-root) with a read-only root filesystem.
```sh
# Build
docker build -t fcgi-proxy .

# Run (mount your config)
docker run -d \
  -p 8080:8080 \
  -v ./config.json:/etc/fcgi-proxy/config.json:ro \
  fcgi-proxy
```
```sh
# Run with CLI flags
docker run -d \
  -p 8080:8080 \
  fcgi-proxy \
  -listen :8080 \
  -network tcp \
  -address php-fpm:9000 \
  -document-root /var/www/html
```

```yaml
services:
  proxy:
    build: .
    ports:
      - "8080:8080"
    command:
      - -listen
      - ":8080"
      - -network
      - tcp
      - -address
      - php-fpm:9000
      - -document-root
      - /var/www/html
    depends_on:
      - php-fpm
  php-fpm:
    image: php:8.3-fpm
    volumes:
      - ./www:/var/www/html
```

An example Helm chart with PHP-FPM + fcgi-proxy as a sidecar is included in `deploy/helm/fcgi-proxy-example/`.
```sh
# Install
helm install my-app deploy/helm/fcgi-proxy-example/

# With custom values
helm install my-app deploy/helm/fcgi-proxy-example/ \
  --set replicaCount=3 \
  --set proxy.port=8090 \
  --set config.listen=":8090"

# Port-forward to test
kubectl port-forward svc/my-app-fcgi-proxy-example 8080:80
curl http://localhost:8080/
```

The Helm deployment includes `securityContext` with `runAsNonRoot`, `readOnlyRootFilesystem`, and `drop: ALL` capabilities.
Configuration is loaded from a JSON file (default: `config.json`). CLI flags override file values. If the file does not exist, built-in defaults are used.
Copy `config.example.json` to `config.json` and edit as needed.
| Option | JSON key | CLI flag | Default | Description |
|---|---|---|---|---|
| Listen address | `listen` | `-listen` | `:8080` | host:port to bind the HTTP server |
| Network | `network` | `-network` | `tcp` | FastCGI upstream network: `tcp`, `tcp4`, `tcp6`, `unix` |
| Address | `address` | `-address` | `127.0.0.1:9000` | FastCGI upstream address (TCP host:port or Unix socket path) |
| Document root | `document_root` | `-document-root` | `/var/www/html` | Absolute path to the PHP document root on the upstream |
| Index file | `index` | - | `index.php` | Default script for non-`.php` URIs (front-controller pattern) |
| Dial timeout | `dial_timeout` | - | `5s` | Timeout for connecting to the upstream |
| Read timeout | `read_timeout` | - | `30s` | Timeout for reading the upstream response; also HTTP server read timeout |
| Write timeout | `write_timeout` | - | `30s` | Timeout for writing to the upstream; also HTTP server write timeout |
| Max body size | `max_body_size` | - | `10485760` (10 MB) | Maximum request body in bytes (1 to 268435456) |
| Max concurrency | `max_concurrency` | - | `1024` | Maximum simultaneous connections (1 to 65535) |
| Pool max idle | `pool_max_idle` | - | `32` | Maximum idle connections kept in the FastCGI connection pool (1 to 1024) |
| Pool idle timeout | `pool_idle_timeout` | - | `30s` | How long an idle connection can sit unused before being closed (1s to 5m) |
| Response headers | `response_headers` | - | `{}` | Map of headers added to every response (see below) |
| Locations | `locations` | - | `[]` | Per-path rules — cached upstream proxies AND inline static returns (see Locations) |
| CORS | `cors` | - | `{"enabled": false}` | Cross-Origin Resource Sharing config (see CORS) |
| Authentication | `auth` | - | `{"enabled": false}` | HTTP Digest or Basic authentication (see Authentication) |
Add custom headers to every proxied response. Uses `Set` semantics (overrides any same-named upstream header).
```json
{
  "response_headers": {
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains"
  }
}
```

Header names must contain only alphanumeric characters and hyphens. Values must not contain CR, LF, or null bytes. These are validated at startup.
Note: response headers are not applied to /healthz responses (health checks go to load balancers, not browsers).
The `locations[]` array defines per-path rules that run before the FastCGI dispatch. Each entry must set either `upstream` (cached reverse proxy to an external HTTP server) or `return` (inline static response) — never both, never neither. Up to 100 entries total. Path matching is exact (after fasthttp URI normalization) — `/foo` does not match `/foo/bar`.
Locations are evaluated after CORS preflight / healthz and before the auth gate, so configured paths bypass authentication by design — if you need auth on a location, remove it from `locations[]` and handle it through the backend instead.
```json
{
  "locations": [
    {
      "path": "/apple-app-site-association",
      "upstream": "https://assets.example.com/universal-links/apple-app-site-association",
      "cache_ttl": "1h"
    },
    {
      "path": "/robots.txt",
      "return": {
        "status": 200,
        "body": "User-agent: *\nDisallow:",
        "content_type": "text/plain"
      }
    }
  ]
}
```

Proxy specific paths to external HTTP/HTTPS servers and cache the response. Useful for serving static assets from a CDN or a different origin without routing through PHP-FPM.
| Field | Required | Default | Description |
|---|---|---|---|
| `path` | yes | — | Exact path to match (must start with `/`) |
| `upstream` | yes | — | External URL to fetch from (`http://` or `https://`) |
| `cache_ttl` | no | `5m` | How long to cache a successful (200) response. Go duration format. |
Behavior:

- Only `200 OK` responses are cached. Non-200 responses are not stored and result in `502 Bad Gateway` to the client.
- Concurrent requests for the same path are deduplicated (singleflight) — only one upstream fetch runs at a time.
- When the upstream is unreachable, a previously cached 200 response is served as a stale fallback (up to 24 hours old). Stale entries older than 24 hours are evicted from memory.
- Each cached response body is capped at 10 MB.
- Responses include an `X-Cache: HIT` or `X-Cache: MISS` header.
Security:

- SSRF protection: the proxy blocks connections to private (RFC 1918), loopback, link-local, and cloud metadata (169.254.169.254) addresses. Hostnames are resolved and all IPs are checked before connecting.
- Redirects are limited to 5 hops per request, and the SSRF guard applies at the TCP dial layer so redirects to internal addresses are also blocked.
- Upstream URLs must not contain credentials (`user:pass@host` is rejected at config validation).
- Error messages in logs have credentials stripped from URLs.
Serve a constant response for a path with no upstream call at all — the nginx `location { return 200 '...'; }` equivalent. Typical use cases: `/robots.txt`, `/ads.txt`, `/.well-known/security.txt`, a synthetic `/version` endpoint, or a maintenance-mode override.
```json
{
  "locations": [
    {
      "path": "/robots.txt",
      "return": {
        "status": 200,
        "body": "Sitemap: https://www.example.com/sitemap.xml\nUser-agent: *\nDisallow:",
        "content_type": "text/plain"
      }
    },
    {
      "path": "/favicon.ico",
      "return": {
        "status": 204,
        "body": ""
      }
    },
    {
      "path": "/.well-known/security.txt",
      "return": {
        "status": 200,
        "body": "Contact: mailto:security@example.com\nExpires: 2027-01-01T00:00:00Z",
        "content_type": "text/plain; charset=utf-8"
      }
    }
  ]
}
```

| Field | Required | Default | Description |
|---|---|---|---|
| `path` | yes | — | Exact path to match (must start with `/`) |
| `return.status` | no | `200` | HTTP status code. Must be between 100 and 599. |
| `return.body` | yes | — | Response body. Maximum 64 KiB; anything larger belongs on a real upstream. May be empty (e.g. 204 No Content). |
| `return.content_type` | no | `text/plain; charset=utf-8` | Content-Type header value. Must not contain CR, LF, or NUL. |
| `cache_ttl` | — | — | Rejected at config load when `return` is set — no cache is involved. |
Behavior:

- Body bytes are materialized once at config parse time; the request hot path calls `SetBody` on the pre-allocated `[]byte` — zero upstream calls, zero hot-path allocations beyond what fasthttp's own response serialization needs (~5 allocs/req).
- Configured `response_headers` and CORS headers still apply to the static response.
- Static entries win over upstream entries that share the same path (though duplicate paths are rejected at config load regardless).
Validation:

- `return` and `upstream` are mutually exclusive per entry. Setting both or neither fails at startup.
- `return.body` larger than 64 KiB is rejected — move that payload to an `upstream` entry instead.
- `return.content_type` with CR/LF/NUL is rejected (header-injection defense).
- `cache_ttl` on a `return` entry is rejected (the field is meaningless — the body is already resident in memory and never expires).
- Duplicate paths across `locations[]` are rejected, regardless of variant.
Cross-Origin Resource Sharing handled at the proxy, independently of what the backend would do. When enabled, the proxy short-circuits preflight (OPTIONS) requests locally and injects the appropriate Access-Control-* headers on simple responses. When disabled, the proxy is transparent and the backend can handle CORS itself.
```json
{
  "cors": {
    "enabled": true,
    "allowed_origins": ["https://app.example.com", "app://localhost"],
    "allowed_methods": ["GET", "POST", "PUT", "DELETE", "OPTIONS"],
    "allowed_headers": ["Content-Type", "Authorization"],
    "exposed_headers": ["X-Request-Id"],
    "allow_credentials": false,
    "max_age": "10m"
  }
}
```

| Field | Required | Default | Description |
|---|---|---|---|
| `enabled` | yes | `false` | Master switch. When false, the block is ignored and CORS is not applied. |
| `allowed_origins` | yes (when enabled) | — | Exact-match allowlist. Supports `http://`, `https://`, `app://` (Cordova/hybrid mobile) schemes, the literal `"null"` (sandboxed iframes / `file://`), and the wildcard `"*"`. Scheme/host compared case-insensitively per RFC 6454. |
| `allowed_methods` | no | `[]` | Echoed in preflight `Access-Control-Allow-Methods`. Methods are normalized to upper case. Must be valid HTTP methods (GET, HEAD, POST, PUT, PATCH, DELETE, OPTIONS). |
| `allowed_headers` | no | `[]` | Echoed in preflight `Access-Control-Allow-Headers`. When unset, the proxy echoes the value of the client's `Access-Control-Request-Headers` after validation (rs/cors default). Entries are header-name-validated (alphanumeric + hyphen) or `"*"`. |
| `exposed_headers` | no | `[]` | Sent as `Access-Control-Expose-Headers` on simple responses. Validated like `allowed_headers`. |
| `allow_credentials` | no | `false` | When true, the response includes `Access-Control-Allow-Credentials: true`. Cannot be combined with `"*"` in `allowed_origins` or with `"null"` (both rejected at config load). |
| `max_age` | no | — | Preflight cache duration. Pre-formatted at parse time; range 0 to 24h. Omit or set to 0 to omit the header. |
Behavior details:

- Preflight (`OPTIONS` with `Access-Control-Request-Method` header) is answered by the proxy with `204 No Content` — the FastCGI backend is never consulted.
- Origin-allowed simple requests get `Access-Control-Allow-Origin: <origin>` (exact echo) or `*` in wildcard-no-credentials mode; credentialed responses always echo the specific origin.
- `Vary: Origin` is emitted on every CORS-sensitive response, including rejected preflights (403) and simple requests from disallowed origins, to prevent shared-cache poisoning.
- Origin scheme validation: `http://`, `https://`, `app://`, or literal `"null"`. Anything else is rejected at parse time.
- Port validation: hostnames with a `:port` suffix require a decimal port in 1..65535. Malformed entries like `app://loc:alhost` are rejected.
- Case-insensitive origin matching: browsers normalize origins to lowercase; the proxy lowercases the configured allowlist at parse time and does a zero-allocation fast path for already-lowercase request origins.
- Bypass paths: `/healthz`, `/readyz`, and configured `locations` (both static-return and cached-upstream entries) are not CORS-gated. If you need CORS on those paths, handle it upstream.
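The case-insensitive origin match described above can be sketched as an allocation-free ASCII fold. This is an illustrative function (the name and exact shape are mine, not the proxy's code):

```go
package main

import "fmt"

// originEqualFold compares a request Origin against an allowlist entry that
// was lowercased at config parse time. ASCII-only folding is enough because
// origins are ASCII; no intermediate string is allocated.
func originEqualFold(reqOrigin, allowedLower string) bool {
	if len(reqOrigin) != len(allowedLower) {
		return false
	}
	for i := 0; i < len(reqOrigin); i++ {
		c := reqOrigin[i]
		if 'A' <= c && c <= 'Z' {
			c += 'a' - 'A' // fold ASCII uppercase on the fly
		}
		if c != allowedLower[i] {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(originEqualFold("https://APP.Example.com", "https://app.example.com")) // true
	fmt.Println(originEqualFold("https://evil.example.io", "https://app.example.com")) // false
}
```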
Security hardening (already in place):

- `Access-Control-Request-Headers` is validated for CR/LF/NUL before being echoed back, preventing response-splitting attacks.
- When CORS is enabled, `Origin`, `Access-Control-Request-Method`, and `Access-Control-Request-Headers` are stripped from the CGI environment forwarded to PHP-FPM — the proxy is the single CORS authority.
- `response_headers` entries starting with `Access-Control-` are rejected at config load when CORS is enabled (no silent override).
- `allow_credentials: true` combined with `"*"` or `"null"` origins is rejected at parse time (classic CORS footguns).
- The fasthttp request parser strips CR/LF from header values before the CORS middleware sees them; the NUL check is defense in depth.
Benchmark (AMD Ryzen 5 7600X, isolated middleware work):
| Path | ns/op | B/op | allocs/op |
|---|---|---|---|
| Preflight (allowed origin) | 401 | 0 | 0 |
| Simple cross-origin request | 181 | 0 | 0 |
| CORS disabled (fast-path return) | 2.6 | 0 | 0 |
| Origin case-fold (slow path) | 16.8 | 0 | 0 |
HTTP authentication applied at the proxy, in front of the FastCGI backend. Two schemes are supported: Digest (RFC 7616) and Basic (RFC 7617). Exactly one scheme is active at a time.
```json
{
  "auth": {
    "enabled": true,
    "type": "digest",
    "realm": "fcgi-proxy",
    "algorithm": "SHA-256",
    "nonce_lifetime": "5m",
    "users": [
      { "username": "alice", "ha1": "<64-hex for SHA-256, 32-hex for MD5>" }
    ]
  }
}
```

| Field | Required | Default | Description |
|---|---|---|---|
| `enabled` | yes | `false` | Master switch. When false, the block is ignored. |
| `type` | no | `digest` | `digest` or `basic`. |
| `realm` | yes (when enabled) | — | Sent in the `WWW-Authenticate` challenge. Must not contain CR, LF, NUL, or double-quote. |
| `algorithm` | digest only | `SHA-256` | `SHA-256` (recommended) or `MD5` (legacy, for old clients). |
| `nonce_lifetime` | digest only | `5m` | How long a server-issued nonce stays valid. 30s to 24h. |
| `users` | yes (when enabled) | — | Inline user database. Maximum 1000 entries. See per-scheme fields below. |
| `password_cache` | basic only | enabled with defaults | bcrypt result cache (see below). |
Bypass paths (not gated by auth):

- `/healthz` and `/readyz` — load-balancer and Kubernetes probes stay reachable.
- CORS preflight (`OPTIONS` with `Access-Control-Request-Method`) — browsers strip the `Authorization` header from preflights, so gating them would break every CORS-using client. The CORS middleware handles preflights before the auth gate.
- Configured `locations` — both `return` (inline static) and cached-upstream entries. If a path is in your `locations`, it serves without auth.
Everything else — the FastCGI backend, script resolution, the entire `index.php` front-controller — requires valid credentials.
Credentials are stored as HA1 hex (`H(username:realm:password)`), never plaintext. Compute with:
```sh
# SHA-256 (default)
printf '%s' 'alice:fcgi-proxy:s3cret' | sha256sum

# MD5 (legacy)
printf '%s' 'alice:fcgi-proxy:s3cret' | md5sum
```

| User field | Description |
|---|---|
| `username` | Must not contain `:`, CR, LF, NUL, or double-quote. |
| `ha1` | Lowercase hex of `H(username:realm:password)`. 64 chars for SHA-256, 32 for MD5. |
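The same HA1 values can be computed programmatically. A Go sketch (function names and the example credentials are illustrative):

```go
package main

import (
	"crypto/md5"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// ha1SHA256 computes HA1 = H(username:realm:password) as lowercase hex
// using SHA-256 — 64 hex characters.
func ha1SHA256(username, realm, password string) string {
	sum := sha256.Sum256([]byte(username + ":" + realm + ":" + password))
	return hex.EncodeToString(sum[:])
}

// ha1MD5 is the legacy variant for old clients — 32 hex characters.
func ha1MD5(username, realm, password string) string {
	sum := md5.Sum([]byte(username + ":" + realm + ":" + password))
	return hex.EncodeToString(sum[:])
}

func main() {
	// Same input as the sha256sum/md5sum examples above.
	fmt.Println(ha1SHA256("alice", "fcgi-proxy", "s3cret"))
	fmt.Println(ha1MD5("alice", "fcgi-proxy", "s3cret"))
}
```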
Nonce design: stateless, HMAC-signed. The proxy does not keep a nonce store; nonces are self-verifying base64 of `timestamp || 16 random bytes || HMAC-SHA-256(secret, timestamp||random)[:16]`. On process restart, the HMAC secret is regenerated — clients present the old (now-invalid) nonce, the server responds with `stale=true`, and compliant clients re-auth silently without reprompting.
Hardening in place:

- `qop=auth` is required; the RFC 2069 qop-absent fallback is rejected to prevent downgrade.
- Client-supplied `algorithm=` parameter must match the configured algorithm, preventing hash-function downgrade.
- Client-supplied `uri=` must match the actual request target (RFC 7616 §3.4) — prevents replay of a captured `Authorization` header against a different URI within the nonce lifetime.
- Unknown-username responses run a dummy HMAC pass to equalize response time with wrong-password responses — no user enumeration via timing.
- Response compare uses `crypto/subtle.ConstantTimeCompare`; realm compare uses the same.
- Header parser tolerates `key ="value"` (trailing whitespace) and unescapes `\"` in quoted values.
Trade-off: no `nc` (nonce-count) tracking — a captured valid `Authorization` header can be replayed within the nonce lifetime. Deploy over TLS, and consider a short `nonce_lifetime` (e.g. `30s`) for high-value endpoints on plain HTTP.
Credentials are stored as bcrypt hashes. Generate with `htpasswd`:

```sh
# Cost 10 (default for htpasswd -B)
htpasswd -B -n alice

# Explicit cost (12 recommended for sensitive endpoints)
htpasswd -B -C 12 -n alice
```

| User field | Description |
|---|---|
| `username` | Must not contain `:`, CR, LF, NUL, or double-quote. |
| `password_hash` | bcrypt hash starting with `$2a$`, `$2b$`, or `$2y$`. Plaintext is rejected at config load. |
Config example:

```json
{
  "auth": {
    "enabled": true,
    "type": "basic",
    "realm": "fcgi-proxy",
    "users": [
      { "username": "alice", "password_hash": "$2b$10$..." }
    ],
    "password_cache": {
      "enabled": true,
      "ttl": "1m",
      "max_entries": 10000
    }
  }
}
```

Hardening in place:
- Unknown-username path runs a dummy bcrypt compare at the same cost as the slowest configured hash — determined at parse time. No user enumeration via timing.
- Plaintext passwords are rejected at config load (prefix check: must start with `$2a$`, `$2b$`, or `$2y$`).
- Empty passwords are allowed if the stored hash matches, per RFC 7617.
- Passwords containing `:` work correctly (split on the first `:` only).
- Base64 decode uses a 512-byte stack buffer; oversized `Authorization` headers are rejected without spilling to heap.
bcrypt verification at cost 10 takes ~50 ms. Without a cache, a single authenticated request caps a CPU core at ~10–20 RPS — a 1000× throughput regression vs the unauthenticated case. The proxy includes a Caddy-style in-memory cache to avoid re-running bcrypt on every request.
| Field | Default | Description |
|---|---|---|
| `password_cache.enabled` | `true` | Set to `false` to disable the cache and pay the full bcrypt cost on every request. |
| `password_cache.ttl` | `1m` | How long a successful auth stays cached. Range 1s to 1h. |
| `password_cache.max_entries` | `10000` | Upper bound on cached entries. Range 1 to 1,000,000. |
Design:

- Cache only successful authentications. Failures and unknown users always run bcrypt, so the cache cannot accelerate brute-force attacks.
- Keys are `HMAC-SHA-256(secret, stored_hash || 0x00 || password)` with a per-cache 32-byte random secret. Binding to the stored hash means rotating a password automatically orphans prior entries. The HMAC secret makes a partial memory dump (map bytes only) useless for offline password cracking.
- Eviction is lazy: on `set` at capacity, expired entries are dropped first, then — if still full — half the map is bulk-dropped (map iteration order as pseudo-random eviction). No background goroutines.
- Hot path: atomic-counter RLock + map lookup + `time.Now()` compare. Zero allocations, 12 ns/op.
- HMAC derivation also zero-allocation via pre-computed inner/outer pads and stack-buffered SHA-256.
Performance (AMD Ryzen 5 7600X, `bcrypt.MinCost`):

| Path | ns/op | allocs/op |
|---|---|---|
| Cache hit (Authorization → success) | 643 | 8 |
| Cache miss (bcrypt MinCost verify) | 673,020 | 19 |
| Cache hit isolated (check only) | 12 | 0 |
| HMAC key derivation isolated | 217 | 0 |
Speedup: 1046× at MinCost, scaling linearly with bcrypt cost. At operator cost 10 (~50 ms bcrypt), the speedup approaches ~77,000×.
Operator note on password rotation: the cache is rebuilt when the process restarts (new HMAC secret, empty map). Editing `config.json` in place without restarting the process will keep cached credentials valid for up to `password_cache.ttl` (default 1 minute). For immediate rotation, send SIGTERM and start the proxy again.
Timeouts use Go duration strings: `100ms`, `5s`, `1m30s`, `1h`, etc. All timeouts must be between 100ms and 5m. Cache TTLs have no upper bound (use `0` to disable caching and always fetch).
At startup, the proxy validates:

- `listen` is a valid `host:port`
- `network` is one of `tcp`, `tcp4`, `tcp6`, `unix`
- `address` is not empty
- `document_root` is an absolute path
- `index` is a plain filename (no `/`, `\`, or null bytes)
- All timeouts are valid durations within bounds
- `max_body_size` is between 1 and 256 MB
- `max_concurrency` is between 1 and 65535
- `pool_max_idle` is between 1 and 1024
- `pool_idle_timeout` is between 1s and 5m
- `response_headers` keys are alphanumeric/hyphens; values have no CR/LF/null
- `locations` paths start with `/`; upstreams are `http://` or `https://` without credentials; TTLs are non-negative; maximum 100 locations; each entry is either `upstream`- or `return`-shaped (never both)
- `cors` origins have `http://`/`https://`/`app://` schemes with valid host[:port]; `allow_credentials` + `"*"` or `"null"` rejected; `max_age` ≤ 24h; `response_headers` cannot contain `Access-Control-*` keys while CORS is enabled
- `auth` realm is non-empty and free of CR/LF/NUL/quote; digest users supply HA1 matching the algorithm hash size; basic users supply bcrypt-prefixed password hashes; `password_cache` applies only to Basic
Invalid configuration causes the proxy to exit with a clear error message.
Two separate endpoints, each with a distinct purpose:
Returns `200 OK` with body `ok`. Never touches PHP-FPM, the location cache, or authentication. Answers the single question: is the proxy process still running? Use this for Kubernetes `livenessProbe` so a temporarily-down PHP-FPM does not cause the pod to be restarted.
Sends a FastCGI request to PHP-FPM's built-in status handler (`pm.status_path`) and returns:

- `200 OK` / body `ready` — PHP-FPM answered with a 2xx.
- `503 Service Unavailable` / body `not ready` — upstream unreachable, timed out, or returned a non-2xx.
Use this for Kubernetes `readinessProbe` so traffic is paused when the upstream is down without killing the pod.
Behavior notes:

- One automatic retry with a short backoff. The FastCGI client already retries once on a stale pooled socket; `/readyz` adds a second top-level retry so a PHP-FPM restart blip does not flap readiness.
- Dedicated FastCGI client with the configured `readiness.timeout` as its dial/read/write deadline, so probes stay bounded even when the main request timeouts are generous.
- Upstream body is never echoed — the status page reply is discarded so the endpoint cannot leak worker-pool internals.
- Auth / CORS / locations are bypassed — probes are meant for load balancers, not browsers.
- Disable with `readiness.enabled: false` — `/readyz` then mirrors `/healthz` and always returns 200.
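The liveness/readiness split described above maps onto a Kubernetes pod spec along these lines. Port and probe intervals here are placeholders, not shipped defaults:

```yaml
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /readyz
    port: 8080
  periodSeconds: 5
  failureThreshold: 2
```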
PHP-FPM side configuration required: add `pm.status_path = /status` (or your chosen path) to your PHP-FPM pool config. Match the path to `readiness.status_path`.
```ini
; /usr/local/etc/php-fpm.d/www.conf
pm.status_path = /status
```

Sets the proxy to draining mode. Once triggered:

- `/readyz` flips to `503` / body `draining` without probing upstream — Kubernetes removes the pod from the Service endpoints.
- Every HTTP/1 response gets `Connection: close` — load balancers and keepalive clients rotate away before the pod dies.
- `/healthz` stays `200` so Kubernetes liveness does not restart the pod mid-drain.
- Idempotent — repeat calls return `200 draining` without side effects.
Wire this from a Kubernetes preStop hook so the pod bleeds off in-flight requests cleanly:
```yaml
lifecycle:
  preStop:
    exec:
      command: ["/bin/sh", "-c", "wget -qO- http://127.0.0.1:8080/healthz/fail || true; sleep 15"]
```

Combine with `terminationGracePeriodSeconds` greater than your longest request plus the preStop sleep (example uses 45s).
Returns `live` or `draining` so operators can confirm the preStop hook reached the proxy.
Disable with `readiness.drain_enabled: false`. `/healthz/fail` then returns 404 so an operator who wires the hook without enabling the feature finds out immediately instead of silently shipping no-op traffic.
Security — IP allowlist: `/healthz/fail` enforces `readiness.drain_trusted_cidrs` (default `["127.0.0.0/8", "::1/128"]`). Non-matching remotes get `403 forbidden`. This is defense in depth; path-level isolation at the ingress/NetworkPolicy is still the primary mitigation. Set the list to `[]` to disable the check, or add CNI/pod ranges if drain is coordinated from a sibling pod. The `/healthz/drain-status` endpoint is intentionally unrestricted — it only reports a state `/readyz` already reveals.
The proxy maintains a pool of reusable TCP/Unix connections to PHP-FPM, eliminating the TCP handshake overhead on every request:
- LIFO ordering — most recently used connection returned first (most likely alive)
- No liveness probe — dead connections are detected on write and discarded (eliminates 3 syscalls per reuse)
- `keepConn=true` — tells PHP-FPM to keep connections open after each request
- Background eviction — idle connections are cleaned up automatically
- Configurable — `pool_max_idle` (default 32) and `pool_idle_timeout` (default 30s)
For best results, set `pool_max_idle` to match or exceed your PHP-FPM `pm.max_children`.
- Pooled buffers — `bufio.Writer`, `bufio.Reader`, `bytes.Buffer` (stdout/stderr), record content buffers, and stdin streaming buffers are all reused via `sync.Pool`
- Direct buffer writes — `ReadRecordInto` appends FastCGI stdout/stderr content directly into the response buffer, eliminating per-record intermediate copies
- Oversized buffer eviction — pooled `bytes.Buffer` instances larger than 1 MB are discarded instead of returned to the pool, preventing high-water-mark retention
- Zero-allocation header processing — blocked-header checks and CGI env key construction use fixed-size stack buffers with no heap allocations
- Pre-sized maps — CGI params map pre-allocated to 28 entries; response headers map pre-allocated to 8 entries
- Pre-estimated encoding buffer — `EncodeParams` estimates total byte count upfront, reducing append-growth from ~5 allocations to 1
- Direct MIMEHeader reuse — `textproto.MIMEHeader` is cast directly to `map[string][]string` instead of copied
| Operation | ns/op | B/op | allocs/op |
|---|---|---|---|
| EncodeParams (18 keys) | 425 | 576 | 1 |
| WriteRecord (1 KB) | 23 | 8 | 1 |
| ReadRecord (1 KB) | 274 | 1093 | 3 |
| WriteStreamFromReader (8 KB) | 169 | 56 | 2 |
| ParseHTTPResponse | 706 | 1136 | 12 |
| ReadResponse (full round-trip) | 615 | 1110 | 12 |
| BuildEnvKey | 16 | 0 | 0 |
| IsBlockedHeader | 13 | 0 | 0 |
Both proxying to the same PHP-FPM backend (50 children, pm = static), 10-second test duration.
Throughput (requests in 10s):
| Test | fcgi-proxy | nginx | Diff |
|---|---|---|---|
| Minimal JSON (GET, 50c) | 254,297 | 168,320 | +51% |
| Front-controller (GET /, 50c) | 257,313 | 161,263 | +60% |
| Heavy workload (GET, 10KB resp, 50c) | 147,693 | 126,056 | +17% |
| POST with body (50c) | 246,976 | 161,365 | +53% |
| Health check (50c) | 913,235 | 888,454 | +3% |
| High concurrency (GET, 200c) | 259,729 | 152,274 | +71% |
Latency p50 (ms):
| Test | fcgi-proxy | nginx |
|---|---|---|
| Minimal JSON | 0.6 | 1.5 |
| Front-controller | 0.9 | 1.8 |
| Heavy workload | 2.2 | 3.5 |
| POST with body | 0.9 | 1.9 |
| Health check | 0.5 | 0.5 |
| High concurrency (200c) | 1.7 | 12.5 |
A full benchmark suite is included in `benchmark/` that runs fcgi-proxy and nginx side-by-side against the same PHP-FPM backend. It tests 6 scenarios: minimal JSON, front-controller routing, heavy workload (~10 KB response), POST with body, health check, and high concurrency (200 connections).
Requirements: Docker, Docker Compose, `hey` (`go install github.com/rakyll/hey@latest`)
```sh
cd benchmark

# Start PHP-FPM (50 children), fcgi-proxy, and nginx
docker compose up -d

# Run all tests (default: 10s duration, 50 concurrency)
./run.sh

# Custom duration and concurrency
./run.sh 30s 100

# Clean up
docker compose down
```

The benchmark stack:

- `php-fpm` — PHP 8.3 FPM Alpine with `pm.max_children = 50`
- `fcgi-proxy` — built from source, listening on port 8081
- `nginx` — nginx 1.27 Alpine with equivalent FastCGI config, listening on port 8082
- Both proxies connect to the same PHP-FPM container over TCP

Included PHP test scripts:

- `www/index.php` — minimal JSON response (~60 bytes)
- `www/heavy.php` — 100 users with md5 hashes (~10 KB response)
- `www/echo.php` — echoes POST body metadata
```sh
go test -bench=. -benchmem ./fcgi/ ./proxy/
```

The proxy sets the following CGI environment variables for each request:
| Parameter | Source |
|---|---|
| `GATEWAY_INTERFACE` | `FastCGI/1.0` |
| `SERVER_PROTOCOL` | Actual request protocol (HTTP/1.0 or HTTP/1.1) |
| `SERVER_SOFTWARE` | `fcgi-proxy` |
| `SERVER_NAME` | Host header (port stripped, null bytes rejected) |
| `SERVER_PORT` | Derived from listen address |
| `REQUEST_METHOD` | From the HTTP request |
| `REQUEST_URI` | Full request URI including query string |
| `SCRIPT_NAME` | Resolved PHP script path |
| `SCRIPT_FILENAME` | Absolute path: `document_root` + `script_name` |
| `PATH_INFO` | Extra path after `.php` (cleaned of `..` sequences) |
| `QUERY_STRING` | From the URI |
| `DOCUMENT_ROOT` | From configuration |
| `DOCUMENT_URI` | SCRIPT_NAME + PATH_INFO |
| `REMOTE_ADDR` | Client IP (IPv6 brackets stripped) |
| `REMOTE_PORT` | Client port |
| `CONTENT_TYPE` | From Content-Type header |
| `CONTENT_LENGTH` | From actual body length (not the header value) |
| `HTTPS` | `on` if the connection is TLS |
| `HTTP_*` | All client headers (except blocked ones) |
| `HTTP_X_FORWARDED_FOR` | Authoritative client IP (client-supplied value stripped) |
| `HTTP_X_REAL_IP` | Authoritative client IP (client-supplied value stripped) |
The following client headers are stripped before forwarding to prevent spoofing:

- `Proxy` (httpoxy CVE-2016-5385)
- `X-Forwarded-For` (replaced with authoritative value)
- `X-Real-IP` (replaced with authoritative value)
- `Connection`, `Transfer-Encoding`, `Trailer` (hop-by-hop)
- `Content-Type`, `Content-Length` (set explicitly from actual values)
The proxy resolves PHP scripts using a front-controller pattern compatible with Laravel, Symfony, WordPress, and similar frameworks:
| Request URI | SCRIPT_NAME | PATH_INFO |
|---|---|---|
| `/index.php` | `/index.php` | |
| `/index.php/api/users` | `/index.php` | `/api/users` |
| `/` | `/index.php` | |
| `/admin/` | `/admin/index.php` | |
| `/api/users` | `/index.php` | `/api/users` |
| `/admin/dashboard.php` | `/admin/dashboard.php` | |
Extension matching is case-insensitive (.php, .PHP, .Php all work).
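The mapping in the table above can be sketched as a small resolver. This is my reconstruction of the rules, not the proxy's actual function; the configured index file is passed in:

```go
package main

import (
	"fmt"
	"strings"
)

// resolveScript reproduces the front-controller mapping from the table:
// a .php segment splits into SCRIPT_NAME + PATH_INFO, directory requests
// get their index, and everything else routes through the front controller.
func resolveScript(path, indexFile string) (scriptName, pathInfo string) {
	// Case-insensitive ".php" match, so /Admin/App.PHP also resolves.
	if i := strings.Index(strings.ToLower(path), ".php"); i >= 0 {
		end := i + len(".php")
		return path[:end], path[end:]
	}
	if strings.HasSuffix(path, "/") {
		return path + indexFile, "" // directory request: serve its index
	}
	return "/" + indexFile, path // front-controller: route through the index
}

func main() {
	fmt.Println(resolveScript("/api/users", "index.php"))           // /index.php /api/users
	fmt.Println(resolveScript("/admin/", "index.php"))              // /admin/index.php
	fmt.Println(resolveScript("/index.php/api/users", "index.php")) // /index.php /api/users
}
```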
The proxy implements multiple layers of defense:

- Path traversal: `filepath.Clean` + `filepath.Rel` boundary check ensures `SCRIPT_FILENAME` never escapes the document root
- Null byte injection: requests with null bytes in URI path or query string are rejected with 400
- httpoxy: the `Proxy` header is unconditionally stripped
- Header injection: CGI env keys are validated via single-pass byte-level filter (alphanumeric + hyphen only, must contain at least one letter, max 251 chars); response header keys/values are validated at config load
- IP spoofing: client-supplied `X-Forwarded-For`/`X-Real-IP` are stripped; authoritative values injected from the actual TCP connection
- SSRF: location cache upstreams are blocked from connecting to private, loopback, link-local, and cloud metadata addresses; DNS resolution is checked; redirects limited to 5 hops per request
- Hop-by-hop leaking: `Connection`, `Transfer-Encoding`, `Keep-Alive`, `TE`, `Trailer`, `Upgrade`, `Proxy-Authenticate`, `Proxy-Authorization` are filtered from upstream responses (case-insensitive)
- Error isolation: internal errors are logged server-side; clients receive only a generic `502 Bad Gateway`; upstream URLs in logs have credentials stripped
- Host header sanitization: null bytes and control characters in the `Host` header are rejected
- Server identity: the `Server` response header is suppressed
- Body limits: configurable `max_body_size` (up to 256 MB); FastCGI upstream response capped at 128 MB; location cache upstream capped at 10 MB
- Timeout enforcement: all network operations have bounded deadlines (100ms to 5m)
- Concurrency cap: configurable `max_concurrency` (up to 65535)
- FastCGI protocol: EndRequest `protocolStatus` is checked; non-complete statuses return 502
- Cache safety: only 200 responses are cached; stale fallback limited to 24 hours with automatic eviction; singleflight prevents cache stampede
- Memory safety: pooled buffers are capped at 1 MB before returning to the pool; oversized buffers are discarded to prevent high-water-mark retention
- Container security: Dockerfile runs as `nobody`; Helm chart sets `runAsNonRoot`, `readOnlyRootFilesystem`, `allowPrivilegeEscalation: false`, `drop: ALL`
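The `filepath.Rel` boundary check named in the first bullet can be sketched as follows, assuming Unix-style paths; `insideRoot` is an illustrative name, not the proxy's function:

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// insideRoot joins and cleans the requested script path against the document
// root, then verifies the relative path back from the root never climbs
// upward — the boundary check that keeps SCRIPT_FILENAME inside the root.
func insideRoot(docRoot, script string) bool {
	full := filepath.Clean(filepath.Join(docRoot, script))
	rel, err := filepath.Rel(docRoot, full)
	if err != nil {
		return false
	}
	return rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

func main() {
	fmt.Println(insideRoot("/var/www/html", "/index.php"))     // true
	fmt.Println(insideRoot("/var/www/html", "/../etc/passwd")) // false — escapes the root
}
```

Checking the `Rel` result rather than a string prefix of the cleaned path avoids the classic `/var/www/html-evil` false positive that naive `strings.HasPrefix(full, docRoot)` checks allow.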
```sh
go test -race -cover ./...
```

The test suite includes:
- Unit tests for FastCGI protocol encoding/decoding, params, records
- Config validation tests (all fields, edge cases, error paths)
- Location cache tests (fetch, cache hit, TTL expiry, stale fallback, stale eviction, singleflight dedup, SSRF guard, non-200 rejection)
- Integration tests with a mock FastCGI server (full HTTP-to-FastCGI round-trip, header filtering, body forwarding, health check, response headers, null byte rejection)
- Path traversal attack tests
- Benchmarks for all hot-path operations
MIT